% Combined i186/Z80 design
\documentclass{article}

% Too long to type :)
\newcommand{\tosh}{Toshiba TLCS-900H}

\begin{document}
\title{Combined i186/Z80 backend design}
\author{Michael Hope michaelh@earthling.net.nz}
\date{\today}
\maketitle

\begin{abstract}
There is much similarity between the Zilog Z80, Nintendo GBZ80, Intel
i186, and \tosh processors.  This document describes the design of a
backend consisting of a register allocator and set of extendable code
generators for SDCC which can target all of these processors.
\end{abstract}

\section{Motivation}
The primary motivation is to add a i186 backend to SDCC, and in the
process come to understand register allocation.  The secondary goal is
to attempt to combine the common parts from the Z80 and i186 backends
to make both easier to maintain.  In the 'would be nice' category is
adding support for the \tosh.

\section{Processor descriptions}

\subsection{Zilog Z80}
\begin{tabular}{l|l|l}
	Name	& Parts	& Notes \\ \hline
	AF	& A	& Accumulator and flags.  Unusable as a pair. \\ \hline
	BC	& B, C	& \\ \hline
	DE	& D, E	& \\ \hline
	HL	& H, L	& \\ \hline
	IX	& None	& Index register. \\ \hline
	IY	& None	& Index register. \\
\end{tabular}

The Z80 also has a switchable alternate register set AF', BC', DE',
and HL' which are not accessible directly.  It is assumed that it is
too hard to track these to make it worthwhile.  IX and IY can be split
at the byte level by using undocumented instructions.  While this
would make them more usable as general purpose registers, it is
ignored for compatibility.

\subsection{Nintendo GBZ80}
The GBZ80 is basically a Z80 less the index registers and the
alternate register set. \\
\begin{tabular}{l|l|l}
	Name	& Parts	& Notes \\ \hline
	AF	& A	& Accumulator and flags.  Unusable as a pair. \\ \hline
	BC	& B, C	& \\ \hline
	DE	& D, E	& \\ \hline
	HL	& H, L	& \\
\end{tabular}

\subsection{Intel i186}
\begin{tabular}{l|l|l}
	Name	& Parts		& Notes \\ \hline
	AX	& AH, AL	& Accumulator. \\ \hline
	BX	& BH, BL	& \\ \hline
	CX	& CH, CL	& \\ \hline
	DX	& DH, DL	& \\ \hline
	DI	& None		& Destination Index. \\ \hline
	SI	& None		& Source Index. \\ \hline
	BP	& None		& Base pointer. \\
\end{tabular}

Note that the segment registers CS, DS, ES, and SS are not listed.
For simplicity only tiny mode is supported.  Tiny mode is where both
the data and code exist in one segment, such that CS = DS.  This
allows constant data stored in the code segment to be accessed in the
same method as data.  This may cause trouble later if far mode is
needed.

\subsection{\tosh}
The 900H seems to be inspired by the Z80.  It is listed as a 16 bit
device, but most of the registers are 32 bits wide. \\
\begin{tabular}{l|l|l}
	Name	& Parts		& Notes \\ \hline
	XWA	& WA, W, A	& Accumulator \\ \hline
	XBC	& BC, B, C	& \\ \hline
	XDE	& DE, D, E	& \\ \hline
	XHL	& HL, H, L	& \\ \hline
	XIX	& IX		& \\ \hline
	XIY	& IY		& \\ \hline
	XIZ	& IZ		& \\
\end{tabular}

All registers can act as either an index base or an index offset.  The
offset is limited to sixteen bits.  Apparently XIX, XIY, and XIZ can
be split at the byte level.  For simplicity this is ignored.

\section{Common features}
\subsection{Stack}
The stack grows downwards.  All push operations are pre-decrement.
All but the GBZ80 have a base pointer register that can be used along
with an offset to access local variables and parameters.  The GBZ80 and
Z80 both have problems with more than 127 bytes of local variables due
to their offset being a INT8.

\subsection{Registers}
All contain a reasonable but small number of registers.  All have some
special purpose registers which can either be handled separately or
ignored.  All general purpose registers can be split at the byte and
word level, but doing so may make the rest of the register
unavailable.

\subsection{Memory}
All have a 64k address space.  However this should not be assumed to make
far support easier later.

\section{Design}
\subsection{Basics}
The design is mainly stack based, such that all local variables and
parameters go onto the stack.  Compare with the mcs51 backend which
overlays variables onto a shared static area.

All stack access goes through the base pointer register (BP).  The
stack frame consists of the parameters pushed right to left, the
return context, and then local variables.  SP points to the base of
the local variables.  BP points to the bottom of the return context
and hence to just above the top of the local variables.  Note that as
the stack grows down the parameters appear with the left most
parameter at the lowest address.

A scratch register will be available for any sub operation to use and
will be valid only within that sub operation.  The accumulator is also
unavailable.  Return values are normally returned in the scratch
register and accumulator.

\begin{tabular}{l|l|l|l|l}
	Name		& i186	  & Z80	    & GBZ80  & 900H 	\\ \hline
	Base pointer 	& BP	  & IX	    & HL     & XIX 	\\ \hline
	Scratch		& BX	  & HL	    & DE     & XHL	\\ \hline
	Return		& BX, AX  & HL, IY  & DE, HL & XHL	\\ \hline
	Available	& CX, DX  & BC, DE  & BC     & XBC, XDE	\\ \hline
	Ignored		& SI, DI  & IY	    & None   & XIY, XIZ	\\
\end{tabular}

\subsection{Register allocator}
The current Z80 and mcs51 register allocators perform these steps:
\begin{enumerate}
	\item Pack each basic block by removing straight assignments and marking
remat. iTemps.
	\item Set the number of registers required for each live range based on
the type and size of the live range.
	\item Assign registers to each segment by deassigning registers and stack
locations for any just expired iTemps, skipping any instructions which don't need
registers, and spilling and assigning registers to the result of this tuple.
	\item Create the register mask for this segment.
\end{enumerate}

Optimisations include assigning into the accumulator or the scratch
register where possible.  This requires knowledge of what the code
generator touches for a given instruction.

The first generation register allocator will only pack assignments and mark
remat. variables.  Only the register management is processor specific.  The
allocator may ask for a given size register or if a given size register is
available.  Note that only whole registers may be returned.  For example, 
allocation will fail if a sixteen bit register is requested and no pair
is available, even two eight bit registers are available.  Note that on
the Z80, GBZ80, and i186 a request for a 32 bit register will always fail.

\subsection{Code generator}
The possible operations are:
\begin{itemize}
	\item NOT - Logical not.  0 -> 1, others -> 0.
	\item CPL - Bitwise complement.
	\item UMINUS - Unary minus.  result = 0 - left.
	\item IPUSH - Push immediate onto the stack.
	\item CALL - Call a function.
	\item PCALL - Call via pointer.
	\item FUNCTION - Emit the function prelude.
	\item ENDFUNCTION - Emit the function prologue.
	\item RET - Load the return value and jump to end of function.
	\item LABEL - Generate a local label.
	\item GOTO - Jump to a local label.
	\item Arithmitic - +, -, *, /, \%.
	\item Comparison - LT, GT, LEQ, GEQ, !=, =.
	\item Logical - \&\&, ||
	\item Binary - AND, OR, XOR.
	\item Shift - RRC, RLC, LSR, LSL.
	\item Pointer - Set and Get.
	\item Assign.
	\item IF jump.
	\item Misc - Jump table, cast, address of.
\end{itemize}

\end{document}